1 Introduction

The fundamental quantities one requires in the calculation of scattering
processes involving hadronic particles are the parton distributions. Global
fits use all available data, largely structure functions, and the most
up-to-date QCD calculations, currently NLO in $\alpha_s(Q^2)$, to best
determine these parton distributions and their consequences. In the global
fits input partons are parameterized as, e.g.

$$x f(x, Q_0^2) = (1 - x)^{\eta}\,(1 + \epsilon x^{0.5} + \gamma x)\,x^{\delta}$$

at some low scale $Q_0^2 \sim 1-5\,\mathrm{GeV}^2$, and evolved upwards using
NLO DGLAP equations. Perturbation theory should be valid if
$Q^2 > 2\,\mathrm{GeV}^2$, and hence one fits data for scales above
$2-5\,\mathrm{GeV}^2$. This cut should also remove the influence of higher
twists, i.e. power-suppressed contributions.

In principle there are many different parton distributions - all quarks
and antiquarks and the gluons. However, $m_c, m_b \gg \Lambda_{QCD}$ (and top
does not usually contribute), so the heavy parton distributions are determined
perturbatively. Also we assume $s = \bar s$, and that isospin symmetry holds,
i.e. $p \to n$ leads to $d(x) \to u(x)$ and $u(x) \to d(x)$. This leaves 6
independent combinations. Relating $s$ to $\tfrac{1}{2}(\bar u + \bar d)$ we
have the independent distributions

$$u_V = u - \bar u, \quad d_V = d - \bar d, \quad \mathrm{sea} = 2(\bar u + \bar d + \bar s), \quad \bar d - \bar u, \quad g.$$

It is also convenient to define
$\Sigma = u_V + d_V + \mathrm{sea} + (c + \bar c) + (b + \bar b)$. There are
then various sum rules which constrain the parton inputs and are conserved
by evolution order by order in $\alpha_s$, i.e. the number of up and down
valence quarks and the momentum carried by partons (the latter being an
important constraint on the gluon, which is only probed indirectly),

$$\int_0^1 \left[ x\Sigma(x) + x g(x) \right] dx = 1.$$
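As an illustration of how the momentum sum rule constrains the gluon, the following minimal sketch fixes the gluon normalisation from the momentum left over by the quarks and then checks the integral numerically. The simplified input shape $x f(x) = A\,x^{\delta}(1-x)^{\eta}$ and every parameter value are hypothetical toy choices, not values from any of the fits discussed here.

```python
import math

def beta(a, b):
    """Euler Beta function B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def x_pdf(x, norm, delta, eta):
    """Toy input shape x*f(x) = norm * x**delta * (1 - x)**eta (hypothetical)."""
    return norm * x**delta * (1.0 - x)**eta

# Hypothetical toy parameters at the input scale, not fitted values.
SIGMA = {"norm": 4.0, "delta": 0.3, "eta": 3.0}   # x*Sigma(x), all quarks
GLUON_SHAPE = {"delta": 0.1, "eta": 5.0}          # x*g(x) up to its normalisation

# Momentum carried by quarks: the integral of x*Sigma(x) over 0..1 is
# norm * B(delta + 1, eta + 1) for this simple shape.
quark_momentum = SIGMA["norm"] * beta(SIGMA["delta"] + 1.0, SIGMA["eta"] + 1.0)

# The sum rule then fixes the gluon normalisation.
gluon_norm = (1.0 - quark_momentum) / beta(
    GLUON_SHAPE["delta"] + 1.0, GLUON_SHAPE["eta"] + 1.0
)

def total_momentum(n_steps=100000):
    """Midpoint-rule check of the integral of x*Sigma(x) + x*g(x) over 0..1."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        total += x_pdf(x, **SIGMA) + x_pdf(x, gluon_norm, **GLUON_SHAPE)
    return total * h

if __name__ == "__main__":
    print("quark momentum fraction:", quark_momentum)
    print("gluon normalisation:", gluon_norm)
    print("total momentum:", total_momentum())
```

In a real fit the gluon shape has more parameters, but the same logic applies: one gluon parameter is not free, because the sum rule determines it.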

When extracting partons one needs to consider that not only are there
6 independent combinations, but there is also a wide distribution of x from
0.75 to 0.00003. One needs many different types of experiment for a full
determination. The sets of data usually used are: H1 and ZEUS $F_2^{p}(x,Q^2)$
data which cover small x and a wide range of $Q^2$; E665 $F_2^{p,d}(x,Q^2)$
data at medium x; BCDMS and SLAC $F_2^{p,d}(x,Q^2)$ data at large x; NMC
$F_2^{p,d}(x,Q^2)$ at medium and large x; CCFR $F_2^{\nu(\bar\nu)p}(x,Q^2)$ and
$F_3^{\nu(\bar\nu)p}(x,Q^2)$ data at large x which probe the singlet and
valence quarks independently; ZEUS and H1 $F_{2,\mathrm{charm}}^{p}(x,Q^2)$
data; E605 $pN \to \mu\bar\mu + X$ data constraining the large x sea; E866
Drell-Yan asymmetry data which determine $\bar d - \bar u$; CDF W-asymmetry
data which constrain the $u/d$ ratio at large x; CDF and D0 inclusive jet data
which tie down the high x gluon; and NuTeV dimuon data which constrain the
strange sea.

The quality of the fit to data is usually determined by the $\chi^2$. There
are various alternatives for calculating this. The simplest is adding
statistical and systematic errors in quadrature. This ignores the correlations
between data points, but it is the only available method for many data sets.
In principle it should be improved upon, but in practice it sometimes works
perfectly well.

A more sophisticated approach is to use the covariance matrix

$$C_{ij} = \delta_{ij}\,\sigma_{i,stat}^2 + \sum_{k=1}^{n} \rho_{ij}^{k}\,\sigma_{k,i}\,\sigma_{k,j},$$

where $k$ runs over each source of correlated systematic error and
$\rho_{ij}^{k}$ are the correlation coefficients. The $\chi^2$ is defined by

$$\chi^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} (D_i - T_i(a))\,C^{-1}_{ij}\,(D_j - T_j(a)),$$

where $N$ is the number of data points, $D_i$ is the measurement and $T_i(a)$
is the theoretical prediction depending on parton input parameters $a$.
Unfortunately, this relies on inverting large matrices.

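One can also minimize with respect to the systematic errors, i.e.
incorporate the systematic errors into the theory prediction

$$f_i(a, s) = T_i(a) + \sum_{k=1}^{n} s_k\,\Delta_{ik},$$

where $\Delta_{ik}$ is the one-sigma correlated error for point $i$ from
source $k$. In this case the $\chi^2$ is defined by

$$\chi^2 = \sum_{i=1}^{N} \left( \frac{D_i - f_i(a, s)}{\sigma_{i,unc}} \right)^2 + \sum_{k=1}^{n} s_k^2,$$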

where the second term constrains the values of $s_k$. This allows the data to
move en masse relative to the theory, but assumes the correlated systematic
errors are Gaussian distributed. One can actually solve for each of the $s_k$
analytically, simplifying greatly. This method is identical to the
correlation matrix definition of $\chi^2$ at the minimum, but it has the
double advantage that smaller matrices need inverting and one sees explicitly
the shift of data relative to theory. However, one may ask whether Gaussian
correlated errors are realistic and whether it is valid to move data to
compensate for the shortcomings of theory. MRST find that for HERA data
increments in $\chi^2$ using this method are much the same as for adding
errors in quadrature, and data move towards theory. However, for Tevatron jet
data the correlated systematic errors dominate and must be incorporated
properly.
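The equivalence of the two definitions at the minimum can be checked numerically. The sketch below is a toy illustration under stated assumptions: the residuals, errors and correlated shifts are invented numbers, each systematic source is taken as fully correlated between points (so the covariance matrix is $C_{ij} = \delta_{ij}\sigma_i^2 + \sum_k \Delta_{ik}\Delta_{jk}$), and the helper names `solve`, `chi2_covariance` and `chi2_shifts` are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def chi2_covariance(resid, sigma, delta):
    """chi^2 = r^T C^-1 r with C_ij = delta_ij sigma_i^2 + sum_k Delta_ik Delta_jk."""
    n = len(resid)
    cov = [[sum(di * dj for di, dj in zip(delta[i], delta[j]))
            + (sigma[i] ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    weighted = solve(cov, resid)  # C^-1 r, an N x N problem
    return sum(r * w for r, w in zip(resid, weighted))

def chi2_shifts(resid, sigma, delta):
    """Minimise sum_i ((r_i - sum_k s_k Delta_ik)/sigma_i)^2 + sum_k s_k^2 over s."""
    n_sys = len(delta[0])
    a = [[(1.0 if k == m else 0.0)
          + sum(d[k] * d[m] / sig ** 2 for d, sig in zip(delta, sigma))
          for m in range(n_sys)] for k in range(n_sys)]
    b = [sum(r * d[k] / sig ** 2 for r, d, sig in zip(resid, delta, sigma))
         for k in range(n_sys)]
    shifts = solve(a, b)  # analytic minimum, only an n_sys x n_sys problem
    chi2 = sum(((r - sum(sk * dk for sk, dk in zip(shifts, d))) / sig) ** 2
               for r, d, sig in zip(resid, delta, sigma))
    return chi2 + sum(sk ** 2 for sk in shifts), shifts

# Hypothetical toy inputs: residuals D_i - T_i, uncorrelated errors sigma_i,
# and one-sigma correlated shifts Delta_ik for two systematic sources.
RESID = [0.30, -0.10, 0.25, 0.05]
SIGMA = [0.20, 0.15, 0.25, 0.10]
DELTA = [[0.10, 0.02], [0.08, -0.03], [0.12, 0.01], [0.05, 0.04]]

chi2_cov = chi2_covariance(RESID, SIGMA, DELTA)
chi2_min, best_shifts = chi2_shifts(RESID, SIGMA, DELTA)

if __name__ == "__main__":
    print("covariance-matrix chi2:", chi2_cov)
    print("shifted-theory chi2:   ", chi2_min, "with shifts", best_shifts)
```

The covariance form inverts a matrix of the size of the data set, while the shift form only inverts one of the size of the number of systematic sources and returns the shifts explicitly.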

Once a decision about $\chi^2$ is made, the above procedure completely
determines parton distributions at present. The total fit is reasonably good
and that for CTEQ6 is shown in Table 1 for the large data sets. The total
$\chi^2$ is 1954 for 1811 data points. For MRST the total $\chi^2$ is 2328 for
2097 data points, but the errors are treated differently, and different data
sets and cuts are used. The same sort of conclusion is true for other global
fits (which use fewer data). However, there are some areas where the theory
perhaps needs to be improved, as we will discuss later.

Table 1. Quality of fit to data for CTEQ6M

Data set            No. of data   $\chi^2$
H1 ep                   230         228
ZEUS ep                 229         263
BCDMS $\mu p$           339         378
BCDMS $\mu d$           251         280
NMC $\mu p$             201         305
E605 (Drell-Yan)        119          95
D0 jets                  90          65
CDF jets                 33          49


2 Parton Uncertainties 

2.1 Hessian (Error Matrix) approach 

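In this one defines the Hessian matrix $H$ by

$$\chi^2 - \chi^2_{min} \equiv \Delta\chi^2 = \sum_{i,j} H_{ij}\,(a_i - a_i^{(0)})(a_j - a_j^{(0)}).$$

$H$ is related to the covariance matrix of the parameters by
$C_{ij}(a) = \Delta\chi^2\,(H^{-1})_{ij}$, and one can use the standard formula
for linear error propagation,

$$(\Delta F)^2 = \Delta\chi^2 \sum_{i,j} \frac{\partial F}{\partial a_i}\,(H^{-1})_{ij}\,\frac{\partial F}{\partial a_j}.$$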

This has been employed to find partons with errors by Alekhin and H1
(each with restricted data sets), as demonstrated in Fig. 1.

Fig. 1. H1 determination of the gluon from their own data + BCDMS data with
an emphasis on $g(x,Q^2)$ and $\alpha_s(M_Z^2)$ in the fit



Fig. 2. Results of CTEQ Hessian approach for the gluon uncertainty 

The simple method can be problematic with larger data sets and numbers
of parameters due to extreme variations in $\Delta\chi^2$ in different
directions in parameter space. This is solved by finding and rescaling the
eigenvectors of $H$ (CTEQ), leading to the diagonal form

$$\Delta\chi^2 = \sum_i z_i^2.$$

The uncertainty on a physical quantity is then given by

$$(\Delta F)^2 = \frac{1}{4} \sum_i \left( F(S_i^{(+)}) - F(S_i^{(-)}) \right)^2,$$

where $S_i^{(+)}$ and $S_i^{(-)}$ are PDF sets displaced along eigenvector
directions by a given $\Delta\chi^2$. Similar eigenvector parton sets have
also been introduced by MRST. However, there is an art in choosing the
"correct" $\Delta\chi^2$ given the complication of the errors in the full fit.
Ideally $\Delta\chi^2 = 1$, but this leads to unrealistic errors. CTEQ choose
$\Delta\chi^2 \sim 100$, which is perhaps conservative. MRST choose
$\Delta\chi^2 \sim 50$. An example of results is shown in Fig. 2.
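In practice the eigenvector sets are used by evaluating the observable once per displaced set and combining the differences. A minimal sketch follows, assuming the commonly used symmetric combination $\Delta F = \tfrac{1}{2}\sqrt{\sum_i (F(S_i^{(+)}) - F(S_i^{(-)}))^2}$; the function name and the cross-section values are hypothetical.

```python
import math

def hessian_uncertainty(f_plus, f_minus):
    """Symmetric uncertainty 0.5 * sqrt(sum_i (F(S_i+) - F(S_i-))^2)."""
    if len(f_plus) != len(f_minus):
        raise ValueError("need one +/- pair of values per eigenvector")
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(f_plus, f_minus)))

# Hypothetical values of an observable (arbitrary units) evaluated with the
# central set and with three pairs of eigenvector sets.
F_CENTRAL = 20.0
F_PLUS = [20.3, 20.1, 19.8]
F_MINUS = [19.7, 19.9, 20.4]

delta_f = hessian_uncertainty(F_PLUS, F_MINUS)

if __name__ == "__main__":
    print("Delta F =", delta_f)
    print("relative uncertainty =", delta_f / F_CENTRAL)
```

The size of the displacement, i.e. the chosen $\Delta\chi^2$, is built into the eigenvector sets themselves, so the same function serves for any tolerance.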





Fig. 3. Parton densities and their errors extracted by fits by ZEUS 

2.2 Offset method 

In this the best fit is obtained by minimizing

$$\chi^2 = \sum_{i=1}^{N} \left( \frac{D_i - T_i(a)}{\sigma_{i,unc}} \right)^2,$$

i.e. the best fit and parameters $a_0$ are obtained using only uncorrelated
errors, forcing the theory to be close to unshifted data. The quality of the
fit is estimated by adding errors in quadrature. The systematic errors on the
$a_i$ are determined by letting each $s_k = \pm 1$ and adding the deviations
in quadrature, or equivalently by calculating 2 Hessian matrices

$$M_{ij} = \frac{\partial^2 \chi^2}{\partial a_i\,\partial a_j}, \qquad V_{ij} = \frac{\partial^2 \chi^2}{\partial a_i\,\partial s_j},$$

and defining covariance matrices

$$C_{stat} = M^{-1}, \qquad C_{sys} = M^{-1} V V^{T} M^{-1}, \qquad C_{tot} = C_{stat} + C_{sys},$$

which is used in practice. This was used in early H1 and ZEUS fits. It is
still used by ZEUS, as shown in Fig. 3, and is a conservative approach to
systematic errors, leading to a bigger uncertainty for a given $\Delta\chi^2$.
2.3 Statistical Approach

In principle one constructs an ensemble of distributions labelled by
$\mathcal{F}$, each with probability $P(\{\mathcal{F}\})$, where one can
incorporate the full information about measurements and their error
correlations into the calculation of $P(\{\mathcal{F}\})$. This is
statistically correct, and does not rely on the approximation of linear
propagation of errors in calculating observables. However, it is inefficient,
and in practice one generates $N_{pdf}$ ($N_{pdf}$ can be as low as 100)
different distributions with unit weight but distributed according to
$P(\{\mathcal{F}\})$. Then the mean $\mu_O$ and deviation $\sigma_O$ of an
observable $O$ are given by

$$\mu_O = \frac{1}{N_{pdf}} \sum_{\{\mathcal{F}\}} O(\{\mathcal{F}\}), \qquad \sigma_O^2 = \frac{1}{N_{pdf}} \sum_{\{\mathcal{F}\}} \left( O(\{\mathcal{F}\}) - \mu_O \right)^2.$$
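The ensemble average can be sketched in a few lines. In the example below the list `observable` is a hypothetical stand-in for an observable evaluated on each member of a unit-weight ensemble; the random numbers only mimic such an ensemble and carry no physical meaning.

```python
import math
import random

def ensemble_stats(values):
    """Mean and standard deviation over unit-weight ensemble members."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)

# Hypothetical stand-in for "observable O evaluated on each PDF member".
# Squaring a Gaussian variable makes the toy observable deliberately
# non-linear, so its distribution is not itself Gaussian.
random.seed(1)
N_PDF = 100
observable = [random.gauss(10.0, 0.4) ** 2 for _ in range(N_PDF)]

mu_o, sigma_o = ensemble_stats(observable)

if __name__ == "__main__":
    print("mean =", mu_o, "deviation =", sigma_o)
```

No linear error propagation enters here: any function of the partons can be averaged in the same way, which is the main attraction of the approach.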

Currently this approach uses only proton DIS data sets in order to avoid
complicated uncertainty issues, e.g. shadowing effects for nuclear targets, and
also demands consistency between data sets. However, it is difficult to find
many truly compatible DIS experiments, and consequently the Fermi2001
partons are determined by only H1, BCDMS, and E665 data sets. They result
in good predictions for many Tevatron cross-sections. However, the restricted
data sets mean there is restricted information - data sets are deemed either
perfect or, in the case of most of them, useless - leading to unusual values
for some parameters, e.g. $\alpha_s(M_Z^2) = 0.112 \pm 0.001$ and a very hard
$d_V(x)$ at high x (together these facilitate a good fit to Tevatron jets
independent of the high-x gluon). These partons would produce some extreme
predictions. Nevertheless, the approach does demonstrate that the Gaussian
approximation is often not good, and therefore highlights shortcomings in the
methods outlined in the previous sections. It is a very attractive but
ambitious large-scale project, still in need of some further development, in
particular the inclusion of a wider variety of data.

2.4 Lagrange Multiplier method 

This was first suggested by CTEQ and has been concentrated on by
MRST. One performs the fit while constraining the value of some physical
quantity, i.e. one minimizes

$$\Psi(\lambda, a) = \chi^2_{global}(a) + \lambda F(a)$$

for various values of $\lambda$. This gives a set of best fits for particular
values of the quantity $F(a)$ without relying on the quadratic approximation
for $\Delta\chi^2$. The uncertainty is then determined by deciding an allowed
range of $\Delta\chi^2$. One can also easily check the variation in $\chi^2$
for each of the experiments in the global fit and ascertain if the total
$\Delta\chi^2$ is coming specifically from one region, which might cause
concern. In principle, this is superior to the Hessian approach, but it must
be repeated for each physical process.
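The scan over $\lambda$ can be illustrated with a toy model. The sketch below assumes a purely quadratic $\chi^2$ in two parameters and a linear observable, so that the constrained minimum is known in closed form; all numbers and function names are hypothetical. In a real fit each value of $\lambda$ requires a full numerical re-minimization, and the resulting profile need not be quadratic.

```python
# Toy model (all numbers hypothetical):
#   chi^2(a) = CHI2_MIN + sum_ij H_ij a_i a_j, with a measured from the best fit,
#   F(a) = F0 + sum_i G_i a_i.
CHI2_MIN = 2000.0
H = [[40.0, 10.0], [10.0, 25.0]]
F0 = 20.0
G = [0.8, -0.3]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def constrained_fit(lam):
    """Minimise Psi = chi^2(a) + lam * F(a); here a = -(lam / 2) * H^-1 G."""
    h_inv = inverse_2x2(H)
    a = [-0.5 * lam * sum(h_inv[i][j] * G[j] for j in range(2)) for i in range(2)]
    chi2 = CHI2_MIN + sum(H[i][j] * a[i] * a[j] for i in range(2) for j in range(2))
    f_val = F0 + sum(g * ai for g, ai in zip(G, a))
    return f_val, chi2

def allowed_range(tolerance, lambdas):
    """Range of F over the fits whose chi^2 increase is within the tolerance."""
    kept = [f for f, chi2 in map(constrained_fit, lambdas)
            if chi2 - CHI2_MIN <= tolerance]
    return min(kept), max(kept)

# Scan lambda on a grid and keep the fits with Delta chi^2 <= 50.
LAMBDAS = [0.5 * k for k in range(-400, 401)]
f_low, f_high = allowed_range(50.0, LAMBDAS)

if __name__ == "__main__":
    print("allowed range of F:", f_low, "to", f_high)
```

For this quadratic toy the range agrees, up to the grid spacing in $\lambda$, with linear error propagation through the Hessian; differences between the two appear only once the true $\chi^2$ departs from quadratic behaviour.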




Global Fits of Parton Distributions 




2.5 Results 

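I choose the cross-section for W and Higgs production at the Tevatron and
LHC (for $M_H = 115\,\mathrm{GeV}$) as examples. Using their fixed value of
$\alpha_s(M_Z^2) = 0.118$ and $\Delta\chi^2 = 100$ CTEQ obtain

$$\Delta\sigma_W(\mathrm{LHC}) \approx \pm 4\%, \qquad \Delta\sigma_W(\mathrm{Tev}) \approx \pm 5\%, \qquad \Delta\sigma_H(\mathrm{LHC}) \approx \pm 5\%.$$

Using a slightly wider range of data, $\Delta\chi^2 \sim 50$ and
$\alpha_s(M_Z^2) = 0.119$ MRST obtain

$$\Delta\sigma_W(\mathrm{Tev}) \approx \pm 1.2\%, \qquad \Delta\sigma_W(\mathrm{LHC}) \approx \pm 2\%,$$

$$\Delta\sigma_H(\mathrm{Tev}) \approx \pm 4\%, \qquad \Delta\sigma_H(\mathrm{LHC}) \approx \pm 2\%.$$

MRST also allow $\alpha_s(M_Z^2)$ to be free. In this case $\Delta\sigma_W$
is quite stable but $\Delta\sigma_H$ almost doubles. Contours of variation in
$\chi^2$ for predictions of these cross-sections are shown in Fig. 4.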




Fig. 4. $\chi^2$-plot for W and Higgs production at the Tevatron (left) and
LHC (right) with $\alpha_s$ free and fixed at $\alpha_s = 0.119$

Hence, there are many different approaches to estimating the uncertainties
due to experimental errors, and the types and amount of data actually fit
also differ. Overall the uncertainty from this source is rather small: it
exceeds a few % only for quantities determined by the high x gluon and very
high x down quark. However, different approaches can lead to rather different
central values, as illustrated for determinations of $\alpha_s(M_Z^2)$ in
Table 2. This shows that there are other matters to consider. As well as the
experimental errors on data we need to determine the effect of assumptions
made about the fit, e.g. cuts made on the data, the data sets fit, the
parameterization for input sets, the form of the strange sea, etc. Many of
these can be as important as the errors on the data used (or more so). This is
demonstrated in Fig. 5, which shows the predictions for W and Higgs production
at the Tevatron from MRST2001 and CTEQ6. As well as the consequences of these
assumptions we must also consider the related problem of theoretical errors.

Table 2. Values of $\alpha_s(M_Z^2)$ and its error from different NLO QCD fits

Group     $\Delta\chi^2$               $\alpha_s(M_Z^2)$
CTEQ6     $\Delta\chi^2 = 100$         0.1165 ± 0.0065(exp)
ZEUS      $\Delta\chi^2_{eff} = 50$    0.1166 ± 0.0049(exp) ± 0.0018(model) ± 0.004(theory)
MRST01    $\Delta\chi^2 = 20$          0.1190 ± 0.002(exp) ± 0.003(theory)
H1        $\Delta\chi^2 = 1$           0.115 ± 0.0017(exp) +0.0009/-0.0005(model) ± 0.005(theory)
Alekhin   $\Delta\chi^2 = 1$           0.1171 ± 0.0015(exp) ± 0.0033(theory)
GKK       $\Delta\chi^2_{eff} = 1$     0.112 ± 0.001(exp)



Fig. 5. $\chi^2$-plot for W and Higgs production at the Tevatron with
$\alpha_s$ free. The predictions from CTEQ6 and for fits with only data with
$x > x_{cut}$ retained are marked

3 Theoretical errors 

3.1 Problems in the fit 

Theoretical errors are indicated by some regions where the theory perhaps
needs to be improved to fit the data better. There is a reasonably good fit to
HERA data, but there are some problems at the highest $Q^2$ at moderate x,
i.e. in $dF_2/d\ln Q^2$, as seen for MRST and CTEQ in Fig. 6. Also the data
require the gluon to be valencelike or negative at small x at low $Q^2$, e.g.
the ZEUS gluon in Fig. 7, leading to $F_L(x,Q^2)$ being negative at the
smallest $x, Q^2$. However, it is not just the low x, low $Q^2$ data that
require this negative gluon.
Fig. 6. Comparison of MRST(2001) $F_2(x,Q^2)$ with HERA, NMC and E665 data
for x = 0.008 - 0.032 (left) and CTEQ6 $F_2(x,Q^2)$ with H1 data (right)


The moderate x data need lots of gluon to get a reasonable $dF_2/d\ln Q^2$
and the Tevatron jets need a large high x gluon, and this must be compensated
for elsewhere. In general MRST find that it is difficult to reconcile the fit
to jets and to the rest of the data, and that different data compete over the
gluon and $\alpha_s(M_Z^2)$. The jet fit is better for CTEQ6 largely due to
their different cuts on other data. Other fits do not include the Tevatron
jets, but generally produce gluons largely incompatible with these data.


3.2 Types of Theoretical Error, NNLO 

It is vital to consider theoretical errors. These include higher orders (NNLO),
small x ($\alpha_s^n \ln^{n-1}(1/x)$), large x ($\alpha_s^n \ln^{2n-1}(1-x)$),
low $Q^2$ (higher twist), etc. Note that renormalization/factorization scale
variation is not a reliable method of estimating these theoretical errors
because of increasing logs at higher orders, e.g. at small x


$$P_{gg}^{n} \sim \alpha_s^n(\mu^2)\,\ln^{n-1}(1/x),$$

and scale variations of $P_{qg}$, $P_{gg}$ at a given order never give an
indication of these logs.

In order to investigate the true theoretical error we must consider some 
way of performing correct large and small x resummations, and/or use what 
we already know about NNLO. The coefficient functions are known at NNLO. 
Fig. 7. ZEUS gluon and sea quark distributions at various $Q^2$ values

Singular limits $x \to 1$, $x \to 0$ are known for NNLO splitting functions as
well as limited moments, and this has allowed approximate NNLO splitting
functions to be devised which have been used in approximate global fits.
They improve the quality of fit very slightly (mainly at high x) and
$\alpha_s(M_Z^2)$ lowers from 0.119 to 0.1155. The gluon is smaller at NNLO at
low x due to the positive NNLO quark-gluon splitting function. There is also a
NNLO fit by Alekhin, with some differences - the gluon is not smaller,
probably due to the absence of Tevatron jet data in the fit and to a very
different definition of the NNLO charm contribution. There is agreement in
the reduction of $\alpha_s(M_Z^2)$ at NNLO, i.e. $0.1171 \to 0.1143$.

Using these NNLO partons there is reasonable stability order by order for 
the (quark-dominated) W and Z cross-sections, as seen in Fig. 8. However, 
the change from NLO to NNLO is of order 4%, which is much bigger than 
the uncertainty at NLO due to experimental errors. Also, this fairly good 
convergence is largely guaranteed because the quarks are fit directly to data. 
There is greater danger in gluon dominated quantities, e.g. $F_L(x,Q^2)$, as
shown in Fig. 9. Hence, the convergence from order to order is uncertain. 

3.3 Empirical approach 

We can estimate where theoretical errors may be important by adopting 
the empirical approach of investigating in detail the effect of cuts on the fit 
quality, i.e. we try varying the kinematic cuts on data. The procedure is to
change $W^2_{cut}$, $Q^2_{cut}$ and/or $x_{cut}$, re-fit and see if the
quality of the fit to the remaining data improves and/or the input parameters
change dramatically. (This is similar to a previous suggestion in terms of
data sets.) One then continues until the quality of the fit and the partons
stabilize.

Fig. 8. LO, NLO and NNLO predictions for W and Z cross-sections

Fig. 9. LO, NLO and NNLO predictions for $F_L(x,Q^2)$

For $W^2_{cut}$, raising from $12.5\,\mathrm{GeV}^2$ has no effect. When
raising $Q^2_{cut}$ from $2\,\mathrm{GeV}^2$ in steps there is a slow,
continuous and significant improvement for $Q^2$ up to
$> 12\,\mathrm{GeV}^2$ (631 data points cut), suggesting that any corrections
are probably higher orders, not higher twist. The input gluon becomes slightly
smaller at low x at each step (where one loses some of the lowest x data),
and larger at high x. $\alpha_s(M_Z^2)$ slowly decreases by about 0.0015. The
fit improves for Tevatron jets and BCDMS data. Raising $x_{cut}$ leads to
continuous improvement with stability reached at $x = 0.005$ (271 data points
cut) with $\alpha_s(M_Z^2) \to 0.118$. There is an improvement in the fit to
HERA, NMC and Tevatron jet data, and much reduced tension between the data
sets. At each step the moderate x gluon becomes more positive, at the expense
of the gluon below the cut becoming very negative and
$dF_2(x,Q^2)/d\ln Q^2$ being incorrect. However, higher orders could cure this
in a quite plausible manner, e.g. adding higher order terms to the splitting
functions

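$$P_{gg} \to P_{gg} + \frac{3.6\,\bar\alpha_s^4}{x} \left( \frac{\ln^3(1/x)}{6} - \frac{\ln^2(1/x)}{2} \right),$$

$$P_{qg} \to P_{qg} + \frac{4.3\,N_f\,\bar\alpha_s^4}{6x} \left( \frac{\ln^3(1/x)}{6} - \frac{\ln^2(1/x)}{2} \right),$$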


leaves the fit above $x = 0.005$ largely unchanged, but solves the problem
below $x = 0.005$. (Saturation corrections seem to make the fit worse.) Hence,
the cuts are suggestive of theoretical errors for small x and/or small $Q^2$.
Predictions for W and Higgs cross-sections at the Tevatron are still safe if
$x_{cut} = 0.005$, since they do not sample partons at lower x, and change in
a smooth manner as $x_{cut}$ is lowered, due to the altered partons above
$x_{cut}$, outside the limits set by experimental errors, as seen in Fig. 5.


4 Conclusions 


One can perform global fits to all up-to-date data over a wide range of
parameter space, and there are various ways of looking at uncertainties due to
errors on data alone. There is no totally preferred approach. The errors from
this source are rather small, $\sim 1-5\%$ except in a few regions of
parameter space, and are similar using various approaches. The uncertainties
from input assumptions, e.g. cuts on data, parameterizations etc., are
comparable and sometimes larger, which means one cannot simply believe one
group's errors.

The quality of the fit is fairly good, but there are some slight problems.
These imply that errors from higher orders/resummation are potentially large
in some regions of parameter space, and due to correlations between partons
these affect all regions (the small x gluon influences the large x gluon).
Cutting out low x and/or $Q^2$ data allows a much-improved fit to the
remaining data, and altered partons. Hence, for some processes theory is
probably the dominant source of uncertainty at present and a systematic study
is a priority.